{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/amr_stl.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Famr_stl.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IYwV-UkNNzFp"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1Uf_u7ddMhUt"
   },
   "outputs": [],
   "source": [
    "!pip install hanlp[amr] -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pp-1KqEOOJ4t"
   },
   "source": [
    "## 加载模型\n",
    "HanLP的工作流程是先加载模型，模型的标示符存储在`hanlp.pretrained`这个包中，按照NLP任务归类。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "4M7ka0K5OMWU",
    "outputId": "d74f0749-0587-454a-d7c9-7418d45ce534"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'AMR3_SEQ2SEQ_BART_LARGE': 'https://file.hankcs.com/hanlp/amr/amr3_seq2seq_bart_large_83.30_20220125_114450.zip',\n",
       " 'MRP2020_AMR_ENG_ZHO_XLM_BASE': 'http://download.hanlp.com/amr/extra/amr-eng-zho-xlm-roberta-base_20220412_223756.zip',\n",
       " 'MRP2020_AMR_ZHO_MENGZI_BASE': 'http://download.hanlp.com/amr/extra/amr-zho-mengzi-base_20220415_101941.zip'}"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import hanlp\n",
    "hanlp.pretrained.amr.ALL # 语种见名称最后一个字段或相应语料库"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "BMW528wGNulM"
   },
   "source": [
    "调用`hanlp.load`进行加载，模型会自动下载到本地缓存。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "0tmKBu7sNAXX",
    "outputId": "df2de87b-27f5-4c72-8eb2-25ceefdd8270"
   },
   "outputs": [],
   "source": [
    "amr = hanlp.load('MRP2020_AMR_ENG_ZHO_XLM_BASE')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 抽象意义表示\n",
    "抽象意义表示任务的输入为一个或多个句子，`MRP2020_AMR_ENG_ZHO_XLM_BASE`要求提供分词完毕的句子："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "BqEmDMGGOtk3",
    "outputId": "936d439a-e1ff-4308-d2aa-775955558594"
   },
   "outputs": [],
   "source": [
    "graph = amr([\"男孩\", \"希望\", \"女孩\", \"相信\", \"他\", \"。\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jj1Jk-2sPHYx"
   },
   "source": [
    "返回对象为[penman.Graph](https://penman.readthedocs.io/en/latest/api/penman.graph.html)类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AMRGraph object (top=x2) at 12603529872>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "graph"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "打印时为友好格式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(x2 / 希望-01\n",
      "    :arg1 (x4 / 相信-01\n",
      "              :arg0 (x3 / 女孩)\n",
      "              :arg1 x1)\n",
      "    :arg0 (x1 / 男孩))\n"
     ]
    }
   ],
   "source": [
    "print(graph)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "该AMR的可视化结果为：\n",
    "\n",
    "![amr-zh](https://hanlp.hankcs.com/backend/v2/amr_svg?tokens=%E7%94%B7%E5%AD%A9%20%E5%B8%8C%E6%9C%9B%20%E5%A5%B3%E5%AD%A9%20%E7%9B%B8%E4%BF%A1%20%E4%BB%96%20%E3%80%82&language=zh&scale=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`MRP2020_AMR_ENG_ZHO_XLM_BASE`其实是一个Meaning Representation Parsing模型，支持输出Meaning Representation（MR）格式，该格式比AMR的表达力更强："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'id': '0',\n",
       " 'input': '男孩 希望 女孩 相信 他 。',\n",
       " 'nodes': [{'id': 0,\n",
       "   'label': '男孩',\n",
       "   'anchors': [{'from': 0, 'to': 2}, {'from': 12, 'to': 13}]},\n",
       "  {'id': 1, 'label': '希望-01', 'anchors': [{'from': 3, 'to': 5}]},\n",
       "  {'id': 2, 'label': '女孩', 'anchors': [{'from': 6, 'to': 8}]},\n",
       "  {'id': 3, 'label': '相信-01', 'anchors': [{'from': 9, 'to': 11}]}],\n",
       " 'edges': [{'source': 1, 'target': 3, 'label': 'arg1'},\n",
       "  {'source': 1, 'target': 0, 'label': 'arg0'},\n",
       "  {'source': 3, 'target': 2, 'label': 'arg0'},\n",
       "  {'source': 3, 'target': 0, 'label': 'arg1'}],\n",
       " 'tops': [1],\n",
       " 'framework': 'amr'}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "amr([\"男孩\", \"希望\", \"女孩\", \"相信\", \"他\", \"。\"], output_amr=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意上面“男孩”有2个anchor，分别对应“男孩”和“他”。也就是说，MR格式其实包含了指代消解的结果。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 多语种支持\n",
    "`MRP2020_AMR_ENG_ZHO_XLM_BASE`同时还是一个Cross-Lingual模型，支持的语言列表："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['amr', 'eng'], ['amr', 'zho']]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "amr.config.frameworks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用户可以通过指定language参数来实现英文抽象意义表示的分析："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(w1 / wants-01\n",
      "    :arg1 (b2 / believe-01\n",
      "              :arg0 (g1 / girl)\n",
      "              :arg1 b1)\n",
      "    :arg0 (b1 / boy))\n"
     ]
    }
   ],
   "source": [
    "print(amr(['The', 'boy', 'wants', 'the', 'girl', 'to', 'believe', 'him', '.'], language='eng'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "为了达到最佳效果，建议同时提供每个词的词干："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(w1 / want-01\n",
      "    :arg1 (b2 / believe-01\n",
      "              :arg0 (g1 / girl)\n",
      "              :arg1 b1)\n",
      "    :arg0 (b1 / boy))\n"
     ]
    }
   ],
   "source": [
    "print(amr([('The', 'the'), ('boy', 'boy'), ('wants', 'want'), ('the', 'the'), ('girl', 'girl'), ('to', 'to'),\n",
    "              ('believe', 'believe'), ('him', 'he'), ('.', '.')], language='eng'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "该AMR的可视化结果为：\n",
    "\n",
    "![amr-en](https://hanlp.hankcs.com/backend/v2/amr_svg?tokens=The%20boy%20wants%20the%20girl%20to%20believe%20him%20.&language=en&scale=1)"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "collapsed_sections": [],
   "name": "amr_stl.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
